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ABSTRACT 

Numerical simulations including magnetic fields have become important in 
many fields of astrophysics. Evolution of magnetic fields by the constrained trans- 
port algorithm preserves magnetic divergence to machine precision, and thus rep- 
resents one preferred method for the inclusion of magnetic fields in simulations. 
We show that constrained transport can be implemented with volume-centered 
fields and hyperresistivity on a high-order finite difference stencil. Addition- 
ally, the finite-difference coefficients can be tuned to enhance high-wavenumber 
resolution. Similar techniques can be used for the interpolations required for 
dealiasing corrections at high wavenumber. Together, these measures yield an 
algorithm with a wavenumber resolution that approaches the theoretical max- 
imum achieved by spectral algorithms. Because this algorithm uses finite dif- 
ferences instead of fast Fourier transforms, it runs faster and isn't restricted to 
periodic boundary conditions. Also, since the finite differences are spatially lo- 
cal, this algorithm is easily scalable to thousands of processors. We demonstrate 
that, for low-Mach-number turbulence, the results agree well with a high-order, 
non-constrained-transport scheme with Poisson divergence cleaning. 
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1. Introduction 

Many astrophysical flows involve dynamically significant magnetic fields, such as molec- 
ular clouds, accretion disks, the galactic dynamo, jets, galaxy clusters, stellar dynamos and 
coronae, the solar wind and the interstellar medium. These problems tend to be three- 
dimensional, multiscale and turbulent, so there is an ongoing interest in developing high- 
resolution and efficient magnetohydrodynamics (MHD) algorithms for them. In this paper, 
we outline an extension of the constrained transport algorithm (Evans & Hawley 1988) to the 
combination of higher spatial order and zone-centered grids, and with resolution-enhanced 
tuned derivatives. We then describe how these measures fit together to yield an algorithm 
that closely approaches the theoretical maximum wavenumber resolution of spectral algo- 
rithms. 



1.1. Constrained transport 

The induction equation for a magnetic field B and a velocity field V in ideal MHD is 

9,B = Vx(Vx B). (1) 

Analytically, this equation conserves magnetic divergence: <9t(V • B) = 0. However, this 
may or may not be the case for a finite-difference treatment of this equation. Toth (2000) 
reviews the methods taken by various algorithms to treat the divergence in MHD simula- 
tions. A spectral code explicitly projects the Fourier components so that B(k) • k = 0. For 
a finite difference code, the magnetic field can be evolved by a constrained transport scheme 
that preserves the magnetic divergence to machine precision (Evans & Hawley 1988). Alter- 
natively, if the discretization doesn't conserve magnetic divergence, the divergence can be 
removed with measures such as periodic use of a Poisson solver (Brackbill & Barnes 1980), 
adding a divergence diffusion term c^B = VV ■ B to the magnetic evolution, or following 
an artificial and independently evolving divergence field (Dedner et al. 2002) to propagate 
divergence away from where it is produced and then dissipate it. The Powell scheme (Powell 
et al. 1999) adds a source term to advect divergence rather than let it grow in place. A finite 
difference code can also employ a vector potential A such that B = V x A, in which case the 
magnetic divergence is automatically zero. This requires the use of a higher-order advection 
algorithm to ensure accurate second-derivatives, as is done in the Pencil code (Brandenburg 
& Dobler 2002). 

We denote any finite difference scheme for MHD that explicitly conserves the magnetic 
divergence to machine precision as constrained transport (CT), and any scheme that does 
not as unconstrained transport (UT). Several variations of CT are possible. If the electric 
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field is differenced as a curl: <9tB = — V x E, then the magnetic divergence is preserved to 
machine precision for most grid types (see Appendix). Evans & Hawley (1988) introduced 
CT for staggered grids and Toth (2000) showed that it works for centered grids as well (see 
the Appendix for explanation of centered and staggered grids). Londrillo & Del Zanna (2000) 
further showed that high-order CT is possible on staggered grids with a radius-two stencil. 
In this paper, we show that volume-centered CT is possible on arbitrarily large stencil sizes, 
with hyperresistivity, and that the resolution of this algorithm at moderately high order 
approaches the theoretical maximum exhibited by a spectral code. 

In § (2j we discuss the specifics of the algorithm, and in § [3], we describe test simulations 
that demonstrate the capabilities of CT with high-order spatial derivatives. 

2. Algorithms 

Our algorithm is based on a constrained transport scheme, plus measures to enhance the 
resolution and maintain stability. High wavenumber resolution is achieved by a combination 
of high-order and tuned finite differences plus hyperdiffusivity, and stability is achieved by 
Runge-Kutta timestepping and hyperdiffusivity. 

2.1. Timestepping 

A high-order timestepping scheme for the evolution equations is essential for the stability 
of most algorithms. The time update for a variable Q(t) is Q(At) = Q(0) + AtQ*, where 
Q* represents some estimate of f Q At d t Q(t)dt. One example is the second-order Runge-Kutta 
scheme, which estimates Q(At/2) « Q(0) + (At/2)Q'(0), and then identifies Q* = Q'(At/2). 
Another class of algorithms maintains the conservation of mass and momentum by computing 
fluxes through zone boundaries. A variety of techniques exist for time-extrapolating the 
fluxes at t — At/2, such as piecewise parabolic advection (Colella and Woodward 1984), 
Total Variation Diminishing (Harten 1983), Riemann solvers (Toro 1999), the method of 
characteristics (Stone & Norman 1992a, Hawley & Stone 1995), and many more. MHD poses 
a challenge to time extrapolation because there are seven or eight wavemode characteristics, 
depending on the technique used for treating magnetic divergence. In particular, the well- 
known method-of-characteristics algorithm interpolates along the Alfven characteristic while 
neglecting the fast and slow mode characteristics. For our simulations, we use Runge-Kutta 
for time-extrapolation because it doesn't invoke any diffusive spatial interpolations (§ 12. 2p . 
and because it automatically captures all three MHD wavemode types. In our demonstration 
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implementation, we use second-order Runge-Kutta while the Pencil code (Brandenburg & 
Dobler 2002) uses third-order, although either order has proven successful. 

2.2. Diffusivity 

A common class of algorithms is based on momentum fluxes that are time-extrapolated 
with upwind spatial interpolations. The errors from the interpolations required for these flux 
transport algorithms produce an intrinsic diffusivity that can stabilize the evolution, even in 
the absence of any explicit diffusive terms. The nature and magnitude of the diffusivity has 
been characterized in Zhong (1998) and Dobler et al (2006). 

Runge-Kutta timestepping, on the other hand, has no spatial interpolations, and thus 
no intrinsic diffusivity. One then generally needs an explicit stabilizing diffusivity. One has 
various options for the form of this diffusivity, with Laplacian or hyper-Laplacian typically 
chosen. These diffusivities have the benefit that their magnitude is easily characterized, 
and the diffusive coefficient can be tuned to have the minimum value necessary to preserve 
stability. Consider: 

d t y = z/ 2 v 2 v - z/ 4 v 4 v + z/ 6 v 6 v - 1/ [4] v [4] v. (2) 

Let the Fourier components be V(k), where k = {k x , k y , k z } is the wavenumber. They evolve 
according to 

$V(k) = -z/ 2 k 2 V(k) - z/ 4 k 4 V(k) - z/ 6 k 6 V(k). - v [4] (k x + ky + kj)V(k). (3) 

The v 2 term is the Laplacian viscosity and the others are higher-order hyperdiffusivities. 
Specifically, V 4 = V 2 V 2 , V 6 = V 2 V 2 V 2 , and V [4] = <9 4 + <9 4 + <9 4 . They differ in that the 
higher the order, the more selective they are in diffusing high-k structure while preserving 
low-k structure. Equivalently stated, high-order hyperdiffusivities erase small-scale structure 
without affecting the larger scales. For many physical applications, such as the turbulent 
magnetic dynamo, the large scale structure is unaffected by the nature of the small-scale dif- 
fusivity (Haugen & Brandenburg 2004), and so in these instances, the use of hyper diffusivity 
instead of Laplacian diffusivity enhances the wavenumber resolution. 

The V 4 operator is spherically symmetric in Fourier space while the V' 4 ^ operator is 
not. This affects the maximum-possible timestep because in order to be advectively stable, 
the Courant condition implies that the product \k\At must be less than a given value, and 
so the high-k corners of the 3D Fourier cube are the most vulnerable to advective instability. 
In these corners, the V 4 term delivers more diffusion than the term, but with a cost of 
twice as many finite difference operations. 
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Hyperdiffusivity acts together with high-order finite differences to enhance the resolution 
of a simulation. High-order finite differences allows structure to be finite differenced with 
less error, and hyperdiffusivity allows this structure to evolve with less dissipation than it 
would with Laplacian diffusivity. 

Hyperdiffusivity can benefit the timestep as well as the resolution. Suppose one chooses 
the Laplacian term to provide the dominant fraction of the diffusivity. One typically then 
evaluates by trial and error the minimum value of v 2 and the maximum timestep At that one 
can get away with to maintain stability. Once these values are chosen, the addition of a small 
measure of hyperdiffusivity, small enough so as not to contribute significantly to the total 
diffusivity, but large enough to affect the highest-k structure, increases the maximum stable 
timestep by about 50%. This technique works because the highest-wavenumber structure is 
the most vulnerable to instability, and so the timestep depends most critically on the value 
of the diffusivity for these wavenumbers. 

One could additionally note that "maximum stable timestep" is not sharply defined. 
For instance, a timestep at the cusp of stability might be unstable, but only after perhaps 
more than 10 timesteps. Also, the maximum stable timestep depends on the maximum value 
of the velocity in the simulation, which is changing in time. 



2.3. Magnetohydrodynamic equations 

The equations of incompressible MHD with diffusivity and hyperdiffusivity are 
d t \ = -V- VV + B-VB + z/ 2 V 2 V-i/ 4 V 4 V-i/ [4] V [4] V + i/ 6 V 6 V + z/ [6] V [6] V + z/ d VV-V (4) 



&B = V x 



V x B - ?? 2 J + r/ 4 V 2 J - r/ 6 V 4 J - r? 2 , [4] V [ 4]J + r] D VV ■ B (5) 
J = V x B (6) 



The variables are defined in Table (OQ). The magnetic equation (J5j) is written as a curl 
so that the contrained transport scheme can preserve the magnetic divergence (Appendix). 
The remaining term proportional to rj D outside of the curl has the effect of diffusing away 
any magnetic divergence present, and does nothing otherwise (§ 12.41) . Similarly, the term 
proportional to u D term can be used to help quell kinetic divergence in incompressible simu- 
lations, although for the simulations in this paper we use a spectral projection to accomplish 
this. 

We clarify the meaning of equation (J5J), by assuming that V • B = : 



d t B = B VV — V VB — B V V 



- 6- 



+ r/ 2 V 2 B - r] 4 V 4 B + r/ 6 V 6 B + r/ 2 , [4] V 2 V [4] B + ^VV • B (7) 

The J terms in equation ([5]) are seen to have the role of diffusivities in equation (JTj). In 
the simulations in § [21 we use equation (j^D for constrained transport and equation (0) for 
unconstrained transport. 

Time centering of the staggered-grid and centered-grid constrained transport equations 
encounter a similar circumstance as with the kinetic equation. In the staggered-grid con- 
strained transport configuration employed by Hawley and Stone (1995), the electric fields are 
spatially interpolated from the nearest 8 velocity and magnetic field vectors, and they are 
also time-interpolated to the next half step with an Alfven wave Method of Characteristics 
scheme (Hawley and Stone 1995). The implicit diffusivity inherent in these interpolations 
stabilizes the magnetic field evolution, even in the absence of explicit diffusivity. For Runge- 
Kutta timestepping on a centered grid, there are no spatial interpolations in constructing 
the electric field, and no accompanying intrinsic diffusivity. Some form of diffusivity is then 
generally required to maintain stability. 



2.4. Divergence 

In a constrained transport simulation, the magnetic divergence remains zero to machine 
precision. For unconstrained transport, we set the magnetic divergence to zero with a Poisson 
projection in Fourier space once every four timesteps. In tests, we found that the evolution is 
virtually identical whether the divergence is removed once every timestep or once every four 
timesteps (Maron 2004), and in both cases the fractional magnetic divergence remains below 
one percent. For both constrained and unconstrained transport simulations of incompressible 
flow that we describe below, the kinetic divergence is removed once every timestep in Fourier 
space, although for quasi-incompressible flow, it suffices to do this only once every few 
timesteps (Maron 2004). 

The divergence diffusivity term <9 t B = i] D VV • B is helpful for reducing the effect of 
magnetic divergence in the timesteps between Poisson projections. It diffuses any magnetic 
divergence present, but otherwise does nothing. If each Fourier component has the form 
B(k) = B||k/|k| + Bj_, then the VV • B term evolves B as dtB\\ = ~^ 2 B|| and d t B ± = 0. 
The divergence diffusivity is also helpful for a constrained transport simulation. Without it, 
CT preserves the magnetic divergence but does nothing to remove it. With the divergence 
diffusivity, any divergence present is diffused away. 
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2.5. Initialization 

The constrained transport algorithm evolves the magnetic field in such a way as to 
conserve magnetic divergence. If the initial conditions have zero divergence, the divergence 
remains zero indefinitely. Even if any monopoles do grow slowly, they can be removed 
at negligible computational expense with a Fourier projection, say, once every thousand 
timesteps.Care must be taken with the initial conditions, however, because the divergence 
depends on the method for approximately evaluating derivatives. One may use, for example, 
a spectral or a finite difference divergence operator. For constrained transport to work, the 
derivatives used for making the initial conditions divergenceless must be the same as those 
used in the simulation. Even an analytic function with vanishing divergence may not have 
vanishing numerical divergence. 

To initialize the magnetic field, we apply the following procedure. In three dimensions, 
let the wavenumbers for the magnetic field Fourier components B be k, the length of each 
side of the simulation volume be L, and the number of grid zones on each side be N. If 
the magnetic divergence is defined spectrally, then the constraint on the magnetic field is 
B(k) • k = 0. For a finite difference derivative the constraint is slightly different. If we use 
the high-order finite difference from § 12.61 (eq. [TT]) . we can define an adjusted dimensionless 
wavenumber by 

s 

k* = mjsmihLij/Ni), (8) 

3=S 

and then the finite difference condition for zero divergence is 

B(k) ■ k* = 0. (9) 

The initial conditions for a constrained transport simulation should satisfy this condition, 
which is easily implemented in Fourier space. 



2.6. High-wavenumber finite differences 

High-order spatial derivatives can enhance the wavenumber resolution of a simulation. 
To quantify this, define a function fj(xj) on a periodic grid Xj = j, with j an integer. Then 
construct a finite difference derivative /'(0) at x = using a radius- S stencil. The familiar 
result for the gradient on a radius-one stencil is /'(0) ~ (f\ — /_i)/2, which is obtained from 
fitting a polynomial of order 2 to fj at j = 0. For an order 4 polynomial on a radius- two 
stencil, 

/'(0) ~ i/- a - + \h - (10) 
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For a stencil of order S, 



s 



no) ~ Yl m if> 



(ii) 



j=-s 



where m_j = —rrij. The coefficients for a radius-three stencil are {mi, 7712, 7713} = {3/4,3/20,1/60}. 

Consider the finite-difference error at x = for a Fourier mode sin(7rfcx). (Cosine 
modes can be ignored because they don't contribute to the derivative at x = 0.) We scale 
the wavenumber k to grid units so that k = 1 corresponds to the maximum (Nyquist) 
wavenumber expressible on the grid. The finite difference formula (eq. [TT1) gives 



whereas the correct value is krr. The spectral procedure of taking the derivative by trans- 
forming to Fourier space and back gives the correct value up to k = 1, but at a cost of 
5 log 2 iV floating point operations per grid point per transform, where N is the number of 
grid points, whereas the finite difference derivative (equation [TT1) on a radius S stencil costs 
35—1 floating point operations. Figure ([1]) exhibits the accuracy of finite difference deriva- 
tives of different orders as a function of wavenumber. The wavenumber resolution increases 
with polynomial order. 

The polynomial fit can be tuned by adjusting the coefficients to enhance high-wavenumber 
accuracy substantially at the expense of a negligible loss of accuracy at low wavenumbers 
(Maron 2004). Table [2] gives the coefficients for finite difference operators at various orders, 
along with the maximum wavenumber for which they are 0.5% accurate. In Figured] we also 
show the wavenumber accuracy for a radius-three stencil using tuned coefficients, designed to 
have a relative precision of < 0.5% for k = to 0.50. We will see in § 13. 1.21 that a simulation 
based on this tuned radius-three scheme performs better than a radius-four simulation with 
conventional coefficients. We note that power spectra do not reveal a difference between 
turbulence simulations with tuned and untuned coefficients, because errors in the derivative 
manifest as an advective dispersion rather than as a diffusivity. One has instead to examine 
the fields in real space rather than in Fourier space. 



Nonlinear terms in the MHD equations such as V ■ W suffer an aliasing error for high- 
wavenumber structure. We first illustrate this with a ID example before treating the 3D 
case (Canuto et al. 1987). Let Aj and Bj be discrete functions on a one-dimensional grid 



s 




(12) 



2.7. Dealiasing 
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V 


Velocity 


B 


Magnetic field 






J 


Current (V x B) 


v 2 


Laplacian Viscosity 


V2 


Laplacian resistivity 


v 4 


Hyperviscosity for V 4 


V* 


Hyperresistivity for V 4 


v w 


Hyperviscosity for V' 4 ' 


%] 


Hyperresistivity for V' 4 ' 




Hyperviscosity for V 6 


V* 


Hyperresistivity for V 6 


V 2 , [4] 


Hyperviscosity for V 2 V' 4 ' 


Vz, [4] 


Hyperresistivity for V 2 V' 4 ^ 


v m 


Hyperviscosity for V' 6 ' 


%] 


Hyperresistivity for V^ 6 ' 




Divergence viscosity 


Vd 


Divergence resistivity 



Table 1: Variables in the equations of MHD 



Operation mo mi mi m% m^ 777,5 m % m 7 m 8 h 



d/dx (PI) 








50000 






















o.r. 


d/dx (P2) 








66667 


-0 


08333 


















0.2^ 


d/dx (P3) 








75000 


-0 


15000 





01667 














0.3^ 


d/dx (P4) 








80000 


-0 


20000 





03810 


-0 


00357 










0.4( 


d/dx (P5) 








83333 


-0 


23810 





05952 


-0 


00992 





00079 






0.4^ 


d/dx (P8) 








88889 


-0 


31111 





11313 


-0 


03535 





00870 


-0.00155 0.00018 


-0.00001 


0.5( 


d/dx (T3) 








81796 


-0 


21324 





03683 














0.5( 


d/dx (T8) 








95951 


-0 


42312 





22746 


-0 


12450 





06461 


-0.03004 0.01156 


-0.00273 


0.7( 


d/dx (T8) 








96685 


-0 


43635 





24410 


-0 


14168 





07976 


-0.04148 0.01884 


-0.00536 


0.8( 


d 2 /dx 2 (P3) 


-2.72222 




1.5 




-0.15 





01111 
















9 4 /^ 4 (P3) 


9.33333 




-6.5 




2. 


-0 


16667 
















d 6 /dx 6 (P3) 


-20. 




15. 




-6. 




1. 
















Interpolation (P3) 








58594 


-0 


09766 





01172 














0.31 


Interpolation (T3) 








60103 


-0 


12312 





02214 














0.5( 


Interpolation (T8) 








63099 


-0 


19580 





10149 


-0 


05770 





03248 


-0.01709 0.00793 


-0.00233 


0.8( 



Table 2: Coefficients for finite difference operations. The column u k max n indicates the 
maximum wavenumber for which the operator is no worse than 0.5% accurate. For the odd- 
order derivatives, m_j = —rrij, and for the even-order derivatives as well as the interpolations, 
m_j = my The interpolation coefficients are for interpolation to a point halfway between 
two grid zones using the nearest S grid points on each side (§ 12.8ft 
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Fig. 1. — The accuracy of finite-difference derivatives as a function of the Nyquist-scaled 
wavenumber k. "Polynomial-N" denotes a radius-N stencil with polynomial-based coeffi- 
cients, and "Tuned-N" denotes a radius-N stencil with tuned coefficients (§ 12.61 and Table [2]). 
The "Exact" curve is the derivative as evaluated by a Fourier transform. A radius-N stencil 
with polynomial-based coefficients is accurate to order 2N. 
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with N = 1Q grid points: 1 < j < N, and with accompanying Fourier modes ranging from 
—N/2 < s < (N/2) — 1. The Fourier expansion of A is 

N/2-1 



and similarly for Bj. Let A and B each be composed of a single Fourier mode with s = 6. 
The product C = AB is then composed of a single Fourier mode with s = 12, but in this 
discrete representation, this is equivalent to s = —4. This remapping of high wavenumber 
modes to low wavenumber modes is known as aliasing error. Note that s = 6 is within the 
Nyquist limit of \s\ < N/2 while the product mode s = 12 is not. While A, and Bj might 
be resolvable on the grid, their product need not be. This can be fixed by truncating before 
and after the product all modes outside \s\ < N/3 to ensure that no modes in the product 
will exceed the Nyquist limit. A spectral code can make this truncation because the fields 
are in Fourier space, while a finite difference code cannot. As a practical matter it suffices to 
set the diffusivities high enough so that negligible structure exists outside the N/3 aliasing 
limit. 

Alternatively, the aliasing problem can be remedied with a staggered-grid correction 
(Canuto et al. 1987), and then the truncation isn't necessary. Additionally, this allows 
us to simulate structure beyond k = 2/3. We first exhibit this procedure in ID and then 
extend it to 3D below. To implement the staggered-grid correction, first construct the usual 
product Cj = AjBj. Then interpolate the centered grids Aj and Bj to the staggered grids 
Aj+1/2 and Bj + y 2 , multiply to produce Cj+1/2 and transform back to the centered grid Cj. 
Finally, combine the centered and staggered results as Cj <— (Cj + Cj)/2. This result is free 
from aliasing error for all Fourier modes, and so no Fourier-space truncation is necessary. 
In Fourier space, the fields on the staggered grid can be computed exactly by applying a 
phase shift to each Fourier mode. A finite difference scheme can accomplish this with the 
high-order interpolation discussed in § 12.81 

In three dimensions the aliasing error can be eliminated by truncating the Fourier modes 
outside |s| < N/3, where s = {s x ,s y ,s z } and N = {N x , N y , N z }. The grid-shift correction 
is more complicated. Seven grid shifts are needed to completely correct the error, but it is 
more efficient to settle for a more limited correction involving just one shift (Canuto et al. 
1987). Interpolate A and B to the staggered grid {j x + 1/2, j y + 1/2, j z + 1/2}, multiply, 
return to the centered grid {j x ,j y ,jz}, and as in the one-dimensional case, average the result 
with the product carried out on the original centered grid. This yields an alias-free result if 
accompanied by a Fourier truncation of all modes outside |s| < \/3N/2, which is 94 % of the 
Nyquist limit. For a finite difference code, the diffusivity can be set to minimize the energy 
outside the truncation zone. When the three-dimensional, staggered-grid, aliasing correction 




(13) 



s=-N/2 



-12- 



is combined with high-order, volume-centered, constrained transport, magnetic divergence 
is still preserved to machine precision. This scheme uses the high-order finite differences and 
interpolations discussed in § 12.61 and § 12.81 

2.8. Interpolation 

High-order interpolation allows implementation of the staggered aliasing correction dis- 
cussed in § 12.71 It can also be used for doubling the grid size. In this case, the new points in 
the doubled three-dimensional grid can be generated with a set of interpolations along and 
diagonal to the three grid axes. Doubling the grid is also useful for interpolating to points 
that are not exactly halfway between grid points. For example, the grid can be doubled with 
the high-order interpolation discussed above, and then a simpler algorithm can be used to 
further interpolate to a point anywhere within the refined grid. We used this technique to 
trace the path of magnetic fieldlines in our studies of cosmic ray diffusion (Maron 2003), 
and also for doubling the resolution of turbulent dynamo simulations in Maron et al. (2004). 
Lastly, we have found that high-order tuned derivatives and interpolations are useful for 
post-simulation analysis of timeslice data without the need for computationally expensive 
fast Fourier transforms. 

A high-order interpolation to a point halfway between two grid points can be accom- 
plished with 

s 

/i+i/2 = Yl m i(fi+i + /i+i-i)- ( 14 ) 
1=1 

The error in the interpolation as a function of wavenumber can be calculated by apply- 
ing equation (fill) to a cosine mode centered on fj+1/2' f% = cos(irk(i — (j + 1/2))). The 
interpolated value is 

s 

fj+1/2 = Yl m i cos ( nik ) ( 15 ) 
i=i 

whereas the correct value is fj+1/2 — 1- Table [2] gives coefficients for an interpolation based 
on a radius-three stencil from a polynomial fit to fj, in the row labeled "Interpolation (P3)." 
Tuning the coefficients can improve the wavenumber resolution of the interpolation in a 
similar manner as was done for the derivative. Tuned coefficients for radius 3 and 8 stencils 
are given in Table ([2]) in the rows labeled T3 and T8, with the column labeled k max giving 
the maximum wavenumber for which the interpolation is 0.5% accurate. 
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2.9. Operation count 

A finite difference derivative on a radius 5 stencil costs 5 multiplies and 25—1 adds. 
Since add and multiply units tend to come in pairs on most modern machines, this is ef- 
fectively 45 — 2 floating point operations per grid cell. A code with F finite difference 
convolutions per timestep involves F(AS — 2) floating point operations. 

The computation cost per timestep scales as the number of derivatives computed per grid 
element per time update. For constrained transport, one needs to calculate three derivatives 
for dip, three for d^e, nine for djW j, six for V x B and six for V x E. Diffusion terms 
such as V 2 V and V 2 J aren't included in the tally because they need only be applied every 
few timesteps (Maron 2004). The same is true for the Fourier-space Poisson projection to 
make the fields divergenceless. Maron (2004) found that the results were effectively identical 
whether these terms were applied once every timestep or once every four timesteps. 

In Table E] we list the number of derivatives computed per timestep for various classes of 
algorithms. Pencil is a vector potential code (Brandenburg & Dobler 2002) that calculates 
J directly from A to take advantage of the cache memory. This involves 15 distinct 2-level 
derivatives computed from a 2D stencil. For a radius-S finite difference, a 2D derivative 
involves 45 2 — 1 adds, compared to 25" — 1 adds for a conventional ID derivative. We don't 
consider multiples because they are always less numerous than adds, and because CPUs tend 
to feature add and multiply units in pairs. As a result, a 2D derivative corresponds to 25+ 1 
times the computational effort of a conventional ID derivative. Thus, the calculation of J 
with 5 = 3 corresponds effectively to about 105 conventional derivatives. Alternatively, a 
vector potential algorithm can instead calculates B from A and then J from B in separate 
stages, at a possible cost of extra memory latency. In table [31 we denote a vector potential 
algorithm that calculates J directly from A as "1-stage," and a vector potential algorithm 
that calculates B from A and then J from B as "2-stage." The constrained transport algo- 
rithm evolves V and B according to equations (jlj) and ([5]), and the unconstrained transport 
algorithm evolves V and B according to equations (jlj) and (J7J). The spectral algorithm com- 
putes 15 Fourier transforms instead of derivatives and thus doesn't directly compare with 
the finite difference codes. 

A vector potential code such as Pencil can be adapted to the algorithm described here 
by substituting equation ([5]) for the vector potential equation, and by changing the finite dif- 
ference coefficients from the polynomial-based values to the tuned values. This allows one to 
compute boundary conditions using the magnetic field instead of the vector potential. How- 
ever, there is an extra round of inter-processor communication associated with calculating J 
from B and then applying the curl. 
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A convenient unit for execution speed is Kilo-grid Elements per CPU GigaFLOPS per 
Second (KEGS). A constrained transport code with a radius-three stencil and two Runge- 
Kutta steps per timestep would run ideally at a speed of 1850 KEGS. The actual speed is 
slower of course because of overhead and because the finite difference convolutions don't run 
at the peak floating point speed. In benchmarks, we have observed that the Pencil code, 
which has 3 Runge-Kutta stages, runs at a speed of around 100 KEGS for serial operation 
and 

50 KEGS for massively parallel operation. For comparison, the highly-optimized spectral- 
MHD code Tulku (Maron & Goldreich 2001), has a speed of around 80 KEGS in serial and 
40 KEGS in parallel (Maron 2004). 

3. Simulations 

3.1. The turbulent nonhelical dynamo 

Our first test model is the turbulent, nonhelical, MHD dynamo (Batchelor 1950), which 
is the magnetic analogue of the Kolmogorov cascade for hydrodynamic turbulence. The 
Kolmogorov cascade is the long-term, steady state of hydrodynamic turbulence that is forced 
at the large scale and dissipated by viscosity at the small scale, and it has an energy spectrum 
of E{k) ~ k~ 5 ^ 3 . The nonhelical MHD dynamo has the same setup but includes a spatially 
homogeneous magnetic field with unit magnitude and zero mean. Maron et al. (2004) found 
that the steady state of this system has a kinetic energy spectrum of E{k) oc k~ 2 , where 
the energy is dominantly in large-scale eddys, and a magnetic spectrum with the energy 
predominantly in the smallest-scale (resistive-scale) magnetic structures. The kinetic and 
magnetic spectra are shown in figure [2j 



Algorithm Terms Derivatives 

Constrained transport dip, die, d{Vj, V x B, V x E 27 

Unconstrained transport dip, die, diVj, diBj, <9j(V • B) 30 

2-stage vector potential dip, die, diVj, V x A, V x B 27 

1-stage vector potential (Pencil) dip, die, diVj, VxA, VxVxA 120 

Spectral 15 Fourier transforms N/A 



Table 3: The number of derivatives computed per time update (see § 12. 9ft . 



- 15 - 




Fig. 2.— (a) The kinetic (V) and magnetic (B) spectra for a set of simulations with 
identical initial conditions, after two crossing times, (b) The value of B y along a grid line 
parallel to the a;-axis after 0.4 crossing times, for the same set of simulations. The high- 
order constrained and unconstrained transport simulations (CT3, CT8 & UT8) more closely 
resemble each other than they do the low-order constrained transport simulations (CTl & 
CT2). 
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3.1.1. Models 

We ran a set of simulations of forced homogeneous MHD turbulence (Table H|) with 
and without constrained transport and at various spatial orders to test the effectiveness 
of high-order constrained transport. Each simulation has the same timestep, viscosity and 
resistivity. The grid used in all cases has 64 3 zones covering a periodic unit cube. The forcing 
power, density, initial RMS velocity and magnetic field are all unity, and they remain so in 
the long-term steady state. 

The initial state is the same for all simulations. It was taken from a simulation of forced 
magnetized turbulence (§ 13.11) in the long-term steady state. The magnetic divergence is zero 
in Fourier space: B(k) ■ k = 0. The initial state was modified slightly for the constrained 
transport simulations to make it magnetically divergenceless according to the finite-difference 
derivative (§ 12.51) . This amounts to a very small adjustment in the fields, much less than the 
difference between the fields after a crossing time of evolution. 

Constrained transport needs some form of diffusivity, either Laplacian or hyperdiffu- 
sivity, to maintain stability. With diffusivity, it is stable indefinitely, whereas it is unstable 
without it. To establish this, we evolved the MHD equations with constrained transport 
for ten crossing times stably. The diffusivities used in these models are Laplacian viscosity 
u 2 = 1CT 3 , hyperviscosity is [4] = 2.5 x 1CT 8 and hyperresistivity r/ 2 :1 = 7 x 1CT 12 , which we 
empirically find are the minimum values that maintain stability for V RMS = B RMS = 1 on a 
64 3 grid. Expressed in dimensionless form: V n = b / n /(V RMS dx n ~ 1 ) and fj n = r\ n j ' (y RMS dx n ~ x \ 
these are z7 2 = .064, fj [4] = 6.6 • 1CT 3 and fj 2 , 4] = 7.5 • 1CT 3 . 

The forcing is the same as used by Maron et al. (2004). A random forcing field is 
added to the velocity every timestep. The spectrum of the forcing field is k~ 5 ^ 3 , truncated 
2.5 lattice units from the origin in Fourier space, and the Fourier components have random 
phases. The forcing power, simulation volume and density are unity, which yields RMS 
velocity and magnetic fields of order unity (Maron 2004). 

3.1.2. Results 

The diffusivities are given in § [31 We plot the kinetic and magnetic spectra in figure W(a>)- 
The spectra are very similar for constrained and unconstrained transport simulations, and 
also for different orders. However the spectra alone do not distinguish between simulations of 
different orders because an error in the derivative manifests itself as an advective dispersion 
rather than as a diffusivity (§ 12.61) . One instead has to examine the fields in real space. 
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Tag 


Method 


Stencil 


Finite difference 


h 


L 2 norm 






radius 


technique 






CT1 


Constrained transport 


1 


Polynomial 


0.12 


0.561 


CT2 


Constrained transport 


2 


Polynomial 


0.24 


0.193 


CT3 


Constrained transport 


3 


Polynomial 


0.34 


0.083 


CT4 


Constrained transport 


4 


Polynomial 


0.40 


0.042 


CT5 


Constrained transport 


5 


Polynomial 


0.44 


0.023 


CT8 


Constrained transport 


8 


Polynomial 


0.56 





CT3tl 


Constrained transport 


3 


Tuned 


0.50 


0.034 


CT3t2 


Constrained transport 


3 


Tuned 


0.52 


0.037 


UT8 


Unconstrained transport 


8 


Polynomial 


0.56 


0.267 



Table 4: Index of simulations. CT and UT denote constrained and unconstrained transport. 
The column labeled "finite difference technique" specifies whether the finite difference coef- 
ficients are from a polynomial fit or if they have been tuned to enhance high-wavenumber 
accuracy (§ I2.6p . k max denotes the largest wavenumber for which derivatives are 0.5% accu- 
rate. The L 2 norms are taken with respect to UT8, as discussed in § 13.1.21 The simulations 
with tuned coefficients are CT3tl, which has k max set to 0.50, and CT3t2, which has k max 
set to 0.52. 
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In Figure ®(b), we compare the magnetic fields at t=0.4 crossing times. For the com- 
parison, we examine the difference between the fields integrated over space by computing 
the L 2 norm between simulations i and j : 

,2 f[By(i) - By(j)MV0l) 

ij J B y {i) 2 <No\ + J B y (j) 2 d(Vo\y { ' 

Stone et. al. (1992b) argue that this kind of comparison is more meaningful than merely plot- 
ting the overlay of both fields. The constrained transport simulation with polynomial-based 
finite differences on a radius-eight stencil (CT8 in Table H]) serves as the basis of comparison. 
We compare the constrained transport simulations to an unconstrained transport simulation 
on a radius-eight tuned finite-difference stencil (UT8). We use UT8 as a stand-in for the 
spectral algorithm because of its high wavenumber resolution. 

The spectral algorithm delivers the highest-attainable resolution because spectral deriva- 
tives are exact for all wavenumbers. With this, a 3D spectral simulation without an aliasing 
grid-shift correction can resolve structure up to k=2/3, and with a grid-shift correction it can 
resolve up to k=.94 (Canuto 1987). The spectral algorithm can also set the magnetic diver- 
gence to zero in Fourier space at negligible cost. Unconstrained transport does not explicitly 
conserve magnetic divergence and so in model UT8 the divergence is cleaned with a Fourier 
projection every timestep. We also tried applying the correction every fourth timestep and 
with virtually identical results. The radius-eight stencil of UT8 yields derivatives that are 
accurate up to k=.56. 

The L 2 norms given in Table show how the simulations progressively approach the 
CT8 result as the stencil size increases. The match is poor for CT1 and better for CT3. We 
also note that the radius-three simulation with tuned derivatives (CT3t) performs better than 
the radius-four simulation with polynomial-based derivatives, establishing the effectiveness 
of tuned derivatives. This can also be qualitatively seen in Figure (j2j), where we see that 
the fields for CT8 and UT8 are closely aligned (Fig. [2]), and that they also closely resemble 
those for CT3. 

We attribute the remaining differences between CT8 and UT8 to the fact that the mag- 
netic divergence is removed spectrally in UT8, while it is handled by constrained transport 
in CT8. Collectively, the high-order constrained and unconstrained transport simulations 
(CT3, CT8 & UT8) more closely resemble each other than they do the low-order constrained 
transport simulations (CT1 & CT2). We conclude that CT3 is already a good approximation 
to the spectral algorithm. 
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3.2. Comparison of the constrained transport and vector potential techniques 

We adapted the vector potential code Pencil to run in CT mode and used it to compare 
the vector potential and CT techniques. We ran an Alfven wave on a 32 3 grid with zero 
viscosity and resistivity. (Figure [3]). After ten crossing times, both the vector potential and 
CT techniques yield wave profiles that agree with each other to within 1 percent. The shape 
of the profiles are also well-matched with the initial conditions, with a phase error of 10 
percent. 

We also used both techniques to run a turbulent dynamo simulation We started with 
an initially weak magnetic field in the form of a Beltrami wave and applied helical forcing 
until it grew to a steady state. The box size is (2ir) 3 , the density is unity, the forcing power 
is equal to 0.07, the viscosity is equal to 5 • 10~ 3 , and the resistivity is equal to 5 • 10~ 3 . The 
RMS magnetic field strength is plotted in Figure (j3J). After 30 crossing times, the values for 
Brms f° r the CT and vector potential techniques agree to 1 percent (Figure H]). 

3.3. Oblique Alfven wave test 

We ran an Alfven wave test where the propagation axis is oblique to the grid axes, with 
the initial conditions in Gardiner & Stone (2005): 

V = {0, 0.1sin(27rx + 4tt?/), 0.1 cos(2vrx + 4th/)} (17) 

B = {1, 0.1sin(27rx + 4vn/), 0.1 cos(2vrx + Any)} (18) 

The simulation volume is a unit cube, modeled on a grid of size 16 3 . The velocity field is 
quasi-incompressible, with the divergence removed spectrally every 4 timesteps. The kinetic 
and magnetic diffusivities are all set to zero for this linear problem. We ran two simulations: 
one with third-order polynomial finite differences and another with third-order tuned finite 
differences from Table (j2J). After the wave has traveled 16 times around the periodic box, the 
waveform remains almost indistinguishable from the initial conditions, with the tuned finite 
differences yielding a more precise result than the polynomial finite differences (figure EJ). 

4. Summary 

We have developed a new version of the constrained transport algorithm that uses 
volume-centered fields and hyperresistivity on a high-order finite difference stencil, with 
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tuned finite difference coefficients to enhance high-wavenumber resolution. High-order in- 
terpolation allows implementation of staggered dealiasing. Together, these measures yield a 
wavenumber resolution that approaches the ideal value achieved by the spectral algorithm. 

Volume centered fields are desirable because then V, B and E all reside at the same grid 
location, allowing E to be constructed directly from the cross product of V and B without 
interpolation. For staggered fields, V and B reside at the zone faces and E on the edges, 
and so constructing E involves spatial interpolation, which reduces wavenumber resolution. 

High-order stencils and tuned finite difference coefficients both enhance the wavenumber 
resolution of finite differences. For a radius-three stencil with tuned coefficients, derivatives 
can be computed to a relative precision of 0.5% up to a Nyquist-scaled wavenumber of 
k = 0.5. Without tuning, this would be k = 0.34 for a radius-three stencil. A radius-one 
stencil derivative such as is used in Zeus (Stone & Norman 1992a) is only accurate up to 
k = 0.12. The spectral derivative is precise up to k = 1.00, although in practice it is limited 
to k = 0.94 because of aliasing. Aliasing limits a finite-difference code to k — 0.66 unless 
the finite-difference grid shift aliasing correction is used (§ 12.70 . 

Hyperresistivity is desirable because it is more effective than Laplacian resistivity in dif- 
fusing high-wavenumber modes while at the same time preserving low-wavenumber modes. 
The fact that hyperresistivity can be written as a curl allows its inclusion into CT. If Lapla- 
cian diffusivity were used instead, too much high-wavenumber structure would be diffused 
for the high-order or tuned derivatives to matter. 

The resolution of the algorithm described here approaches that of a spectral code, but 
because it uses finite differences, it runs faster than a spectral code and isn't restricted 
to periodic boundary conditions. Also, since the finite differences are local, it is easily 
scalable to thousands of processors. The spectral algorithm is more difficult to scale to large 
numbers of processors because it involves all-to-all communications between processors. A 
finite difference code only passes information between processors whose subgrids are adjacent 
in physical space. Lastly, because the code works with the magnetic field rather than the 
vector potential, boundary conditions are often easier to implement. 

We received support for this work from NSF Career grant AST99-85392, NSF grants 
AST03-07854 and AST06-12724, and NASA grant NAG5-10103. We acknowledge stimu- 
lating discussions with E. Blackman, A. Brandenburg, B. Chandran, and J. Stone, and we 
also acknowledge the referee, Wolfgang Dobler, for thorough comments that improved the 
paper." 
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A. Appendix 

Constrained transport expresses the magnetic induction equation as a pure curl plus a 
divergence diffusivity: SjB = V x F + ?y D VV ■ B, where F is defined in equation (jSJ). The rj D 
term serves to diffuse away magnetic divergence, and the finite differences are arranged so 
that V • V x F = 0. Thus, the curl term does not contribute to the evolution of the magnetic 
divergence, and if the initial conditions are divergence-free, the magnetic divergence remains 
zero throughout the evolution. 

To see how constrained transport works, denote the vector field by F iJik = {F? jik , F^ jk , F*j k }, 
where {i,j, k} are integers specifying the locations of grid cell centers. There are two basic 
grid types: "centered" and "staggered" (figure [6]). For a centered grid, scalar and vector 
quantities are located at cell centers. For a staggered grid, scalar quantities are located at 
cell centers and vector quantities at cell faces. For instance, we would index the components 
of F as {F ]+1/2 j k , F t j +1 / 2 h , F. j k+1/2 \. 

The finite divergence divergence of the curl of F is V • V x F = eijkdidjFk, which 
consists of terms such as {di&2 — <92<9i)Ffc. One can straightforwardly see that this is zero for 
finite differences of the form eq. [11] for both centered and staggered grids. Thus, constrained 
transport can be coordinated with high-order and tuned finite differences, as well as with 
hyperresistivity. 

For a staggered grid, the "F" vectors are located at cell edges, whereas the V and B 
vectors from which they are constructed are found at cell faces. A staggered grid CT scheme 
therefore involves spatial interpolation, one example being the method of characteristics 
scheme for time-interpolating Alfven waves. We use volume-centered fields and Runge-Kutta 
timestepping because, among other reasons, no interpolation is required. 
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Fig. 3. — The Alfven wave profile after ten crossing times, with (a) the Pencil code using 
the vector potential form of the induction equation and (b) using CT (§ 13. 2p . Alfven units 
denote the fluctuating field strengths compared to the uniform component of the magnetic 
field. 




Fig. 4. — We plot the RMS magnetic field strength for a turbulent dynamo simulation using 
the Pencil code in vector potential and CT mode (§ 13. 2(1 . Both techniques agree to 1 percent 
after 30 crossing times. 
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Fig. 5. — The oblique Alfven wave test from § 13.31 We plot the B z field as a function 
of y, with x and z equal to zero. We show initial conditions (solid points), and results 
from simulations with polynomial finite differences ( open circles ) and tuned finite differences 
(crosses). In each simulation, the grid size is 16 3 and the wave has traveled 16 times around 
the periodic box. The tuned finite differences yield a more precise result. 
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Centered grid 



Staggered grid 



Fig. 6.— We show the configuration for zone-centered and zone-staggered fields in two 
dimensions, where the generalization to three dimensions is straightforward. (Left) A zone- 
centered vector field, where both the X and Y vectors reside at zone centers. (Right) A 
zone-staggered vector field, where X vectors reside on X faces and Y vectors reside on Y 
faces. 



